Cross-modal image retrieval with deep mutual information maximization
Abstract
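In this paper, we study cross-modal image retrieval, where the inputs contain a source image plus some text that describes certain modifications to this image and the desired image. Prior work usually uses a three-stage strategy to tackle this task: 1) extracting the features of the inputs; 2) fusing the feature of the source image and its modified text to obtain a fusion feature; 3) learning a similarity metric between the desired image and the source image plus modified text via deep metric learning. Since classical image/text encoders can learn useful representations, and common pair-based loss functions of distance metric learning are enough for cross-modal retrieval, people usually improve retrieval accuracy by designing new fusion networks. However, these methods do not successfully handle the modality gap caused by the inconsistent feature distributions of different modalities, which greatly influences feature fusion and similarity learning. To alleviate this problem, we apply the contrastive self-supervised learning method Deep InfoMax (DIM) [1] to our approach to bridge this gap by enhancing the dependence between the text, the image, and their fusion. Specifically, our method narrows the modality gap between the text modality and the image modality by maximizing the mutual information between their semantic representations. Moreover, we seek an effective common subspace for the semantically consistent fusion feature and the desired images by utilizing Deep InfoMax between the low-level layer of the image encoder and the high-level layer of the fusion network. Extensive experiments on three large-scale benchmarks show that we have bridged the modality gap between different modalities and achieve state-of-the-art retrieval performance.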
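As a rough illustration of what "maximizing mutual information" between paired representations means in practice, the sketch below estimates an InfoNCE-style contrastive lower bound on mutual information from paired feature vectors. This is an assumption-laden toy, not the paper's method: the abstract does not say which estimator or critic is used, and the names `info_nce_bound` and `make_pairs`, the cosine-similarity critic, the temperature, and the synthetic Gaussian features are all hypothetical choices made here for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_bound(xs, ys, temperature=0.1):
    """Estimate an InfoNCE-style bound on I(X; Y) from N paired samples.

    xs[i] and ys[i] form a positive pair; every ys[j] with j != i serves
    as a negative for xs[i]. The value never exceeds log(N).
    The cosine-similarity critic is an assumption of this sketch.
    """
    n = len(xs)
    xs = [normalize(x) for x in xs]
    ys = [normalize(y) for y in ys]
    total = 0.0
    for i in range(n):
        scores = [dot(xs[i], ys[j]) / temperature for j in range(n)]
        m = max(scores)  # subtract the max for a stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
        total += scores[i] - log_sum  # log-softmax of the positive pair
    return math.log(n) + total / n


def make_pairs(n, dim, noise, seed=0):
    """Synthetic stand-in for paired features: y is a noisy copy of x."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(dim)]
        y = [a + noise * rng.gauss(0, 1) for a in x]
        xs.append(x)
        ys.append(y)
    return xs, ys


aligned_x, aligned_y = make_pairs(n=64, dim=16, noise=0.1)
bound_aligned = info_nce_bound(aligned_x, aligned_y)

# Rotating ys by one position breaks the pairing, so the bound should drop.
shuffled_y = aligned_y[1:] + aligned_y[:1]
bound_shuffled = info_nce_bound(aligned_x, shuffled_y)
```

In a training setting, the negative of such a bound would typically be minimized as a loss so that paired representations become more dependent; here the two values only show that correctly paired features score higher than mismatched ones.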
Similar resources
Multi-modal volume registration by maximization of mutual information
A new information-theoretic approach is presented for finding the registration of volumetric medical images of differing modalities. Registration is achieved by adjustment of the relative position and orientation until the mutual information between the images is maximized. In our derivation of the registration procedure, few assumptions are made about the nature of the imaging process. As a re...
Image Retrieval Using Mutual Information Hou
In this paper, we study an information-theoretic approach to image similarity measurement for content-based image retrieval. In this novel scheme, similarities are measured by the amount of information the images contain about one another, that is, their mutual information (MI). The given approach is based on the premise that two similar images should have high mutual information, or equivalently, the query...
Medical Image Segmentation Based on Mutual Information Maximization
In this paper we propose a two-step mutual information-based algorithm for medical image segmentation. In the first step, the image is structured into homogeneous regions, by maximizing the mutual information gain of the channel going from the histogram bins to the regions of the partitioned image. In the second step, the intensity bins of the histogram are clustered by minimizing the mutual inf...
Intensity Based Image Registration by Maximization of Mutual Information
Biomedical image registration, or geometric alignment of two-dimensional and/or three-dimensional (3-D) image data, is becoming increasingly important in diagnosis, treatment planning, functional studies, computer-guided therapies, and biomedical research [1]. Registration is an important problem and a fundamental task in image processing. In the medical image processing fields...
Journal
Journal title: Neurocomputing
Year: 2022
ISSN: 0925-2312, 1872-8286
DOI: https://doi.org/10.1016/j.neucom.2022.01.078